Broadened-specular reflection and linear subspaces for object recognition

ABSTRACT

The present invention is a method of deriving a reflectance function that analytically approximates the light reflected from an object model in terms of the spherical harmonic components of light. The reflectance function depends upon the intensity of light incident at each point on the model, the intensity of light diffusely reflected, and the intensity of light broadened-specularly reflected in the direction of an observer. This reflectance function is used in the process of machine vision, by allowing a machine to optimize the reflectance function and arrive at an optimal rendered image of the object model, relative to an input image. Therefore, the recognition of an image produced under variable lighting conditions is more robust. The reflectance function of the present invention also has applicability in other fields, such as computer graphics.

BACKGROUND OF THE INVENTION

1. Field of Invention

The present invention relates generally to machine vision, and more particularly, to a method of modeling the broadened-specular reflection of light to allow a machine to recognize the image of an object illuminated by varying lighting conditions.

2. Description of Related Art

It is clear that changes in lighting can have a large effect on how an object, such as a person, looks. Understanding this is essential to building effective object recognition systems.

U.S. patent application Ser. No. 09/705,507, filed 3 Nov. 2000, entitled “Lambertian Reflectance and Linear Subspaces”, now U.S. Pat. No. 6,853,745, the disclosure of which is hereby incorporated by reference, considered the relationship between the function that describes the lighting intensity, and the reflectance function that describes how much light an object reflects as a function of its surface normal, under a given lighting condition. Representing these functions as spherical harmonics, it was shown that for Lambertian reflectance the mapping from lighting to reflectance is a convolution with a nearly perfect low-pass filter. High-frequency components of the lighting hardly affect the reflectance function. Therefore, nearly all Lambertian reflectance functions could be modeled as some linear combination of nine spherical harmonic components.

These low-dimensional approximations have two key advantages. First, they are described with few parameters, thus optimization is simpler. Moreover, the images produced under low-frequency lighting are inherently immune to small changes in orientation. A small rotation in the surface normal is equivalent to a small rotation in the light. However, low-frequency light does not appreciably change with a small rotation. The analysis shows this, in that an image produced by an n^(th) order harmonic light component is computed by taking an n^(th) degree polynomial of the components of the surface normal.

While promising, these results considered only Lambertian reflectance. Specular reflectance can significantly affect how an object appears. For example, a human face may have some oiliness, particularly on the forehead, which reflects light specularly. This is seen as highlights on the object.

Specular reflectance cannot be approximated well by only considering low-dimensional linear subspaces which capture only low-frequency lighting. Intuitively, the set of possible reflectance functions for a perfect mirror is the same as the set of possible lighting functions, as will be shown, infra. The question remains how to cope with specularity in object recognition using models that will inevitably be imperfect.

Specularities are often thought of as mainly present in spatially localized highlights, which occur when a bright light source is reflected by a shiny object. Such highlights can potentially be identified in the image and processed separately. For example, the algorithms described in Coleman, et al., Obtaining 3-Dimensional Shape of Textured and Specular Surfaces Using Four-Source Photometry, Physics-Based Vision Principles and Practice, Shape Recovery, pp. 180–199 (1992), and Brelstaff, et al., Detecting Specular Reflection Using Lambertian Constraints, International Conference on Computer Vision, pp. 297–302 (1988), treat specular pixels as statistical outliers that should be discarded. However, specularities due to low-frequency lighting can be spread out throughout the entire image. Since these specularities affect every pixel in the image, their effect must be better understood; they cannot simply be discarded.

Some previous works have considered specular effects also as convolutions. Basri, et al., (August 2000) considered a version of the Phong model in which the specular lobe is a rotationally symmetric shape centered about a peak that is a 180-degree rotation of the direction of the incoming light about the surface normal. In such a model, specular reflection doubles the frequency of the incoming light, and then acts as a convolution. However, this model is heuristic, and not physically based. It does not extend to realistic models.

Ramamoorthi, et al., On the Relationship between Radiance and Irradiance: Determining the Illumination From Images of a Convex Lambertian Object, to appear in Journal of the Optical Society of America A, considered the effects of arbitrary bi-directional reflectance distribution functions (BRDFs) on the light field. They were able to develop algorithms for recovering the BRDF and/or the lighting given a measured light field and an object with a known 3D structure. Though interesting, these results are not generally applicable to machine vision because, without special sensing devices, the light field is not available. An image of an object is produced by the reflectance function that describes how the surface normals reflect light in one fixed direction. Therefore, an image corresponds to a 2D slice through the light field. For an arbitrary BRDF, the 2D reflectance function for a fixed viewpoint no longer results from a convolution of the lighting.

BRIEF SUMMARY OF THE INVENTION

The goal of the present invention is to allow a machine, such as a computer, to match an input image of an object, such as a human face, to an object model stored in a database associated with the machine, and to perform this matching with a high degree of accuracy.

The steps of the recognition method of the present invention include providing a database of three-dimensional object models, providing an input image, positioning each three dimensional model relative to the input image, determining an optimal rendered image of the model that is most similar to the input image, computing a measure of similarity between the input image and the optimal rendered image of each model, and selecting the object whose optimal rendered image is most similar to the input image. The step of determining an optimal rendered image includes deriving a reflectance function for each object that includes the effects of both broadened-specular and diffuse reflection, and optimizing that reflectance function to most closely match the input image.

The method of the present invention includes rendering the images produced by an object model when illuminated only by each of a plurality of spherical harmonic components of light. The summation of these components defines the reflectance function for the object model. The derived reflectance function takes into account the intensity of light components incident on the object model, the intensity of light diffusely reflected from the model, and the intensity of light specularly reflected from the model, the last component being dependent on the angle between the direction of the observer and the direction of specular reflection.

To overcome the limitations of dealing with specular reflection, a two-pronged strategy is used. The light in low-frequency and higher-frequency components are analyzed separately. The low-frequency components are analyzed using the method of the present invention, and the results of this analysis are evaluated to determine if they are sufficient to match an optimal rendered image of the object model to the input image.

DETAILED DESCRIPTION OF THE INVENTION

Although detailed and/or microscopic models of reflection exist, these apply principally to homogeneous, rigid, material surfaces. For reflection from inhomogeneous, flexible surfaces such as faces, where most of the features of interest are configurational, a more robust model that captures the most important aspects of the scattered light is more appropriate. Thus the Lambertian model, which simply scatters incident radiation equally into the hemisphere above each surface element, is usually chosen.

However, due to broadened-specular reflection, in viewing a person's face, regions of unusually high reflectivity are apparent. The intensity of light reflect from these regions is sensitive to the viewing direction. As this aspect of the illuminated face cannot be captured by Lambertian scattering, even in principle, the Lambertian model is supplemented with the Phong model. The latter scatters the incoming light preferentially into directions distributed about the direction of purely specular reflection. In addition, the distribution of scattered directions is readily analyzable in the spherical-harmonic representation, in a manner similar to that of the Lambertian model. Appropriately weighted, this additional and more selective scattering should adequately model the human face for most lighting conditions of interest.

Though an enhancement of the Phong model of reflection was chosen, the method of the present invention need not be based upon the Phong model. Any model that depends only on the cosine of the angle between the reflected light and the direction of perfect specular reflection can be treated in the same manner.

While not completely dominant, low-frequency lighting will have a considerable effect on the reflectances due to the broadened-specular component of an object. This is because many situations contain significant low-frequency lighting, such as outdoors on a cloudy day, and many indoor settings.

The method of the present invention shows how to analyze the reflectance function that is produced by specular reflection for more physically realistic BRDFs. This is accomplished by adding more realistic terms to the Phong model. These terms modulate the Phong model by the amount of light that is actually striking a surface patch. The new model allows the development of techniques that will be relevant for more physically realistic models, while avoiding some of the complexity of those models.

The physically realistic model leads to the problem of taking integrals in three terms: the lighting intensity, which is a function of its incoming direction; the specular term, which is a function of the angle between the specular peak and the viewing direction; and the amount of light striking the surface, which is a function of the angle between the incoming light and the surface normal. After expanding these terms in spherical harmonics, this can be rewritten as the integral of the product of three harmonics. This problem has been solved in other areas of research, for example atomic spectroscopy. The solution may be thought of as a generalization of the convolution theorem to this case.

The results can be used to determine the set of images that an object produces when lighting consists of a single harmonic component. Therefore, one can analytically characterize the images an object produces, and determine how the effect of low- and high-frequency lighting on an image varies with the parameters of the Phong model.

Conceptually, the method of deriving the reflectance function of an object model according present invention involves transforming the function of light intensity incident upon an object into its low-dimensional spherical harmonic components. In the present sense, low-dimensional is used to mean the 9D subspace formed by the summation of the 0^(th), 1^(st), and 2^(nd) order spherical harmonic components. Each component is applied to the surface normal at each point on the object. Using the reflectance model described herein, the reflectance function is determined, and described in terms of the same spherical harmonic components. These harmonic components can then be summed to produce a rendered image, a linear representation of the image of the object model. As a practical matter, however, the machine can utilize the harmonic components and optimize their weightings to determine if the image received is a match to some known object in its database.

The technique of analyzing the reflectance function in terms of spherical harmonic components follows.

Given L({circumflex over (n)}_(l)), the intensity of light incident in the −{circumflex over (n)}_(i) if direction on a perfectly reflecting, smooth surface with outward pointing normal {circumflex over (n)}, the reflected light r_(S)({circumflex over (n)}_(e),{circumflex over (n)}) in direction {circumflex over (n)}_(e) can be expressed as r _(S)({circumflex over (n)} _(e) ,{circumflex over (n)})=u({circumflex over (n)} _(e) ·{circumflex over (n)})∫d{circumflex over (n)} _(i) L({circumflex over (n)} _(i))δ({circumflex over (n)} _(e) −{circumflex over (n)} _(r)) where {circumflex over (n)}_(r)=2{circumflex over (n)}({circumflex over (n)}_(i)·{circumflex over (n)})−{circumflex over (n)}_(i) is the direction of specular reflection corresponding to incoming light along −{circumflex over (n)}_(i). (Note {circumflex over (n)}_(r) lies in the ({circumflex over (n)},{circumflex over (n)}_(i)) plane, {circumflex over (n)}_(r)·{circumflex over (n)}_(r)=1, and {circumflex over (n)}·{circumflex over (n)}_(r)={circumflex over (n)}·{circumflex over (n)}_(i). Here u(x) is the unit step function: u(x)=0,x<0,u(x)=1,x>0.) The prefactor u({circumflex over (n)}_(ė)·{circumflex over (n)}) ensures that the orientation {circumflex over (n)} of the surface permits the reflection of light in direction {circumflex over (n)}_(e). As ({circumflex over (n)}_(e)−{circumflex over (n)}_(r))=R({circumflex over (n)}′_(e)−{circumflex over (n)}_(i)), where {circumflex over (n)}′_(e)=2{circumflex over (n)}({circumflex over (n)}_(e)·{circumflex over (n)})−{circumflex over (n)}_(e) and R is a rotation of π about {circumflex over (n)}, δ({circumflex over (n)}_(e)−{circumflex over (n)}_(r))=δ({circumflex over (n)}′_(e)−{circumflex over (n)}_(i)), and since

${{\delta\left( {{\hat{n}}_{1} - {\hat{n}}_{2}} \right)} = {\sum\limits_{l\; m}{{Y_{l\; m}\left( {\hat{n}}_{1} \right)}{Y_{l\; m}^{*}\left( {\hat{n}}_{2} \right)}}}},$ where Y_(lm)({circumflex over (n)}) is the lm spherical harmonic evaluated in direction {circumflex over (n)}, r_(S)({circumflex over (n)}_(e),{circumflex over (n)}) can be expressed compactly as

${{r_{s}\left( {{\hat{n}}_{e},\hat{n}} \right)} = {{u\left( {{\hat{n}}_{e} \cdot \hat{n}} \right)}{\sum\limits_{l\; m}{L_{l\; m}{Y_{l\; m}\left( {{2{\hat{n}\left( {{\hat{n}}_{e} \cdot \hat{n}} \right)}} - {\hat{n}}_{e}} \right)}}}}},$ where the L_(lm)=∫d{circumflex over (n)}_(i)L({circumflex over (n)}_(i))Y*_(lm)({circumflex over (n)}_(i)) are the components (strengths) of the incident intensity expressed in spherical harmonics. Choosing {circumflex over (n)}_(e) to be the {circumflex over (z)} axis, and noting that {circumflex over (n)}′_(e)=R{circumflex over (n)}_(e) is now simply the rotation of the {circumflex over (z)} axis about {circumflex over (n)} by π, Y_(lm)(R{circumflex over (z)}) becomes Y_(lm)(β,α), where α and β are the first two Euler angles of R. This is discussed in more detail below in connection with the Phong model. Note especially that as the polar angle of the surface normal {circumflex over (n)} varies from 0 to π/2, the polar angle β varies from 0 to π, as expected for perfect reflection.

For comparison, one can derive the reflected light from a Lambertian surface r_(L)({circumflex over (n)}_(e),{circumflex over (n)}) in a similar manner. r _(L)({circumflex over (n)} _(e) ,{circumflex over (n)})=u({circumflex over (n)} _(e) ·{circumflex over (n)})∫d{circumflex over (n)} _(i) L({circumflex over (n)} _(i))({circumflex over (n)}·{circumflex over (n)} _(i)

0) where (A

B)=max(A, B). A Lambertian surface scatters the incident light equally in all directions {circumflex over (n)}_(e) such that ({circumflex over (n)}_(e)·{circumflex over (n)})>0. The ({circumflex over (n)}·{circumflex over (n)}_(i)) factor weights the incident flux density in the −{circumflex over (n)}_(i) direction. Since

${L\left( {\hat{n}}_{i} \right)} = {\sum\limits_{l\; m}{L_{l\; m}{Y_{l\; m}\left( {\hat{n}}_{i} \right)}}}$ and

$\left( {\hat{n} \cdot {{\hat{n}}_{i}\bigvee 0}} \right) = {\sum\limits_{l\; m}{\left( {4{\pi/\left( {{2l} + 1} \right)}} \right)^{1/2}k_{l}{Y_{l\; m}^{*}\left( {\hat{n}}_{i} \right)}{Y_{l\; m}\left( \hat{n} \right)}}}$ one can obtain the Basri-Jacobs(B-J) Expression:

${r_{L}\left( {{\hat{n}}_{e},\hat{n}} \right)} = {{u\left( {{\hat{n}}_{e} \cdot \hat{n}} \right)}{\sum\limits_{l\; m}{\left( {L_{l\; m}\left( {4{\pi/\left( {{2l} + 1} \right)}} \right)} \right)^{1/2}k_{l}{Y_{l\; m}\left( \hat{n} \right)}}}}$ The k_(l) coefficients calculated by B-J fall off as l⁻², and vanish for l≧3, if l is odd. This, coupled with constraints on the size of the L_(lm) (as L({circumflex over (n)}_(i))≧0) enabled them to truncate the sum over l at l=2.

Such a truncation is clearly not possible for specular reflection; however, apart from mirrors, such reflection is seldom present. One model which retains much of the directionality of specular reflection, but introduces a controlled amount of lateral diffusion is the Phong Model, in terms of which one can write r _(p)({circumflex over (n)} _(e) ,{circumflex over (n)})=u({circumflex over (n)} _(e) ·{circumflex over (n)})∫d{circumflex over (n)} _(i) L({circumflex over (n)} _(i))({circumflex over (n)}·{circumflex over (n)} _(i)

0)({circumflex over (n)} _(e) ·{circumflex over (n)} _(r)

0)^(p) where as above {circumflex over (n)}_(r)=2{circumflex over (n)}({circumflex over (n)}_(i)·{circumflex over (n)})−{circumflex over (n)}_(i). Here the last term distributes the incident light into cones of decreasing intensity about the nominal direction of (specular) reflection {circumflex over (n)}_(r): the larger the exponent p, the sharper the distribution. This expression reduces to r_(L) in the limit that p→0. As such it may over-emphasize the diffuseness of the scattering. Also, Phong scattering is intended to augment rather than replace Lambertian scattering. An expression that reduces to r_(S) in the limit p→∞ is: r′ _(p)({circumflex over (n)} _(e) ,{circumflex over (n)})=u({circumflex over (n)} _(e) ·{circumflex over (n)})∫d{circumflex over (n)} _(i) L({circumflex over (n)} _(i))u({circumflex over (n)}·{circumflex over (n)} _(i))({circumflex over (n)} _(e) ·{circumflex over (n)} _(r)

0)^(p) which may overemphasize the specularity of the scattering. Fortunately the analysis is valid for both: one need only substitute a k_(l) ⁽⁰⁾ for k_(l) to obtain r′_(p) from r_(p).

In the above manner one can write

${L\left( {\hat{n}}_{i} \right)} = {\sum\limits_{l\; m}{L_{l\; m}{Y_{l\; m}\left( {\hat{n}}_{i} \right)}}}$ $\left( {\hat{n} \cdot {{\hat{n}}_{i}\bigvee 0}} \right) = {\sum\limits_{l_{1}\; m_{1}}{\left( {4{\pi/\left( {{2l_{1}} + 1} \right)}} \right)^{1/2}k_{l_{1}}{Y_{l_{1}\; m_{1}}\left( \hat{n} \right)}{Y_{l_{1}\; m_{1}}^{*}\left( {\hat{n}}_{i} \right)}}}$ ${u\left( {\hat{n} \cdot {\hat{n}}_{i}} \right)} = {\sum\limits_{l_{1}\; m_{1}}{\left( {4{\pi/\left( {{2l_{1}} + 1} \right)}} \right)^{1/2}k_{l_{1}}^{(0)}{Y_{l_{1}\; m_{1}}\left( \hat{n} \right)}{Y_{l_{1}\; m_{1}}^{*}\left( {\hat{n}}_{i} \right)}}}$ $\left( {{\hat{n}}_{e} \cdot {{\hat{n}}_{r}\bigvee 0}} \right)^{p} = {\sum\limits_{l_{2}\; m_{2}}{\left( {4{\pi/\left( {{2l_{2}} + 1} \right)}} \right)^{1/2}k_{l_{2}}^{(p)}{Y_{l_{2}\; m_{2}}\left( {\hat{n}}_{e} \right)}{Y_{l_{2}\; m_{2}}^{*}\left( {\hat{n}}_{r} \right)}}}$ where k_(l) is calculated in B-J Appendix A, and k_(l) ^((p)) in B-J Appendix C, where α is used in place of p. k_(l) ⁽⁰⁾ can be obtained analogously, and is given below, along with k_(l); k_(l) ^((p)) is given below as well. Combining one can obtain

${r_{P}\left( {{\hat{n}}_{e},\hat{n}} \right)} = {{u\left( {{\hat{n}}_{e} \cdot \hat{n}} \right)}{\sum\limits_{l\; m}{L_{l\; m} \cdot {\sum\limits_{l_{1}\; m_{1}}{\left( {4{\pi/\left( {{2l_{1}} + 1} \right)}} \right)^{1/2}k_{l_{1}}{{Y_{l_{1}\; m_{1}}\left( \hat{n} \right)} \cdot {\sum\limits_{l_{2}\; m_{2}}{\left( {4{\pi/\left( {{2l_{2}} + 1} \right)}} \right)^{1/2}k_{l_{2}}^{(p)}{{Y_{l_{2}\; m_{2}}\left( {\hat{n}}_{e} \right)} \cdot {\int{{\mathbb{d}{\hat{n}}_{i}}{Y_{l\; m}\left( {\hat{n}}_{i} \right)}{Y_{l_{1}\; m_{1}}^{*}\left( {\hat{n}}_{i} \right)}{Y_{l_{2}\; m_{2}}^{*}\left( {\hat{n}}_{r} \right)}}}}}}}}}}}}$ where the remaining integral over incident direction ({circumflex over (n)}_(i)) is independent of ({circumflex over (n)}_(e)). For the purposes of the above

$\begin{matrix} {k_{l} = \left\{ \begin{matrix} {\pi^{1/2}/2} & {l = 0} & \; \\ \left( {\pi/3} \right)^{1/2} & {l = 1} & \; \\ \frac{\left( {- 1} \right)^{1 + {l/2}}\left( {\left( {{2l} + 1} \right)\pi} \right)^{1/2}{\left( {l - 2} \right)!}}{2^{l}{\left( {{l/2} - 1} \right)!}{\left( {{l/2} + 1} \right)!}} & {{l \geq 2},} & {e\; v\; e\; n} \\ 0 & {{l \geq 2},} & {o\; d\; d} \end{matrix} \right.} \\ {k_{l}^{(0)} = \left\{ \begin{matrix} \pi^{1/2} & {l = 0} & \; \\ \frac{\left( {- 1} \right)^{{({1 + l})}/2}\left( {\left( {{2l} + 1} \right)\pi} \right)^{1/2}{\left( {l - 1} \right)!}}{2^{l}{\left( {\left( {l - 1} \right)/2} \right)!}{\left( {\left( {l + 1} \right)/2} \right)!}} & {{l > 0},} & {o\; d\; d} \\ 0 & {{l > 0},} & {e\; v\; e\; n} \end{matrix} \right.} \\ {a\; n\; d} \\ {k_{l}^{(p)} = \left\{ \begin{matrix} 0 & {{p < l},\left( {l - p} \right)} & {e\; v\; e\; n} \\ \frac{\left( {\left( {{2l} + 1} \right)\pi} \right)^{1/2}\left( {- 1} \right)^{{({l + p + 1})}/2}{p!}{\left( {l - p - 1} \right)!}}{2^{l}{\left( {\left( {l - p - 1} \right)/2} \right)!}{\left( {\left( {l + p + 1} \right)/2} \right)!}} & {{p < l},\left( {l - p} \right)} & {o\; d\; d} \\ \frac{\left( {\left( {{2p} + 1} \right)\pi} \right)^{1/2}2^{p - 1}{p!}{\left( {p - 1} \right)!}}{\left( {{2p} + 1} \right){\left( {{2p} - 1} \right)!}} & {p = l} & \; \\ \frac{\left( {\left( {{2l} + 1} \right)\pi} \right)^{1/2}2^{l + 1}{p!}{\left( {\left( {p + l} \right)/2} \right)!}}{\left( {p - l} \right){\left( {p + l + 1} \right)!}{\left( {{\left( {p - l} \right)/2} - 1} \right)!}} & {{p > l},\left( {p - l} \right)} & {e\; v\; e\; n} \\ \frac{\left( {\left( {{2l} + 1} \right)\pi} \right)^{1/2}{p!}{\left( {\left( {p - l - 1} \right)/2} \right)!}}{2^{l + 1}{\left( {p - l} \right)!}{\left( {\left( {p + l + 1} \right)/2} \right)!}} & {{p > l},\left( {p - l} \right)} & {o\; d\; d} \end{matrix} \right.} \end{matrix}$

Without loss of generality one can take the direction of observation −{circumflex over (n)}_(e) to be the −{circumflex over (z)} axis of our coordinate system, as is commonly chosen. This is also the system in which the lighting is specified. As Y_(lm)({circumflex over (z)}) is zero unless m=0, Y_(l0)({circumflex over (n)}_(r)) must be expressed in terms of Y_(lm)({circumflex over (n)}_(i)) in order to carry out the final integral over {circumflex over (n)}_(i).

Proceeding as follows: {circumflex over (n)}_(r)=2{circumflex over (n)}({circumflex over (n)}_(i)·{circumflex over (n)})−{circumflex over (n)}_(i)=R{circumflex over (n)}_(i) represents a rotation R of {circumflex over (n)}_(i) around by {circumflex over (n)} by π (or by −π). Though it may intuitively seem more natural to view the relation between {circumflex over (n)}_(i) and {circumflex over (n)}_(r) as a rotation in the plane of incidence defined by {circumflex over (n)}_(i) and {circumflex over (n)} about an axis through {circumflex over (n)}_(i)×{circumflex over (n)}, using R instead effects numerous simplifications. From the relation between {circumflex over (n)}_(r) and {circumflex over (n)}_(i) it is clear that R=2{circumflex over (n)}{circumflex over (n)} ^(T) −I whose representation in terms of Euler angles can be found by first equating R₃₃ entries cos β=R ₃₃=2n _(z) ²−1, sin β=2|n _(z)|(1−n _(z) ²)^(½) and then equating R₃₁ entries R ₃₁=2n _(z) n _(x)=sin β cos α; cos α=n _(z) n _(x) /|n _(z)|(n _(x) ² +n _(y) ²)^(½)

Since R is symmetric, cos γ=−cos α and sin γ=sin α. Thus γ=(π−α). Also, sin α=n_(z)n_(y)/|n_(z)|n_(x) ²+n_(y) ²)^(½). (Here (α,β, γ) are the three Euler angles representing R, and {circumflex over (n)}=(n_(x),n_(y),n_(z)). Strictly speaking, the Euler angles refer to a rotation of the coordinates axes while the vectors remain fixed. Hence, one should be concerned with representing R^(T); however, since R is symmetric, R=R^(T).) Since {circumflex over (z)}·{circumflex over (n)}>0,n_(z)>0, and n_(z)/|n_(z)|=1.

Now, to evaluate Y_(l0)(R{circumflex over (n)}_(i)), the polar angle θ_(r) between {circumflex over (n)}_(i) and the rotated {circumflex over (z)}-axis is needed. However, in the original system, the rotated {circumflex over (z)}-axis lies in the {circumflex over (n)}_(βα)=(θ=β,φ=α) direction, and {circumflex over (n)}_(i) is unchanged. Hence cos θ_(r)={circumflex over (n)}_(i)·{circumflex over (n)}_(βα) and, as can be shown

${Y_{l0}\left( {\hat{n}}_{r} \right)} = {{Y_{l0}\left( {R{\hat{n}}_{i}} \right)} = {\left( {4{\pi/\left( {{2l} + 1} \right)}} \right)^{1/2}{\sum\limits_{m}{{Y_{l\; m}^{*}\left( {\beta,\alpha} \right)}{Y_{l\; m}\left( {\hat{n}}_{i} \right)}}}}}$ The expression for the reflected light based upon the Phong model then becomes

${r_{P}\left( {\hat{z},\hat{n}} \right)} = {{u\left( {\hat{z} \cdot \hat{n}} \right)}{\sum\limits_{l\; m}{L_{l\; m} \cdot {\sum\limits_{l_{1}\; m_{1}}{\left( {4{\pi/\left( {{2l_{1}} + 1} \right)}} \right)^{1/2}k_{l_{1}}{{Y_{l_{1}\; m_{1}}\left( \hat{n} \right)} \cdot {\sum\limits_{l_{2}\; m_{2}}{\left( {4{\pi/\left( {2_{l_{2}} + 1} \right)}} \right)^{1/2}k_{l_{2}}^{(p)}{{Y_{l_{2}\; m_{2}}\left( {\beta,\alpha} \right)} \cdot {\int{{\mathbb{d}{\hat{n}}_{i}}{Y_{l\; m}\left( {\hat{n}}_{i} \right)}{Y_{l_{1}\; m_{1}}^{*}\left( {\hat{n}}_{i} \right)}{Y_{l_{2}\; m_{2}}^{*}\left( {\hat{n}}_{i} \right)}}}}}}}}}}}}$ It remains to evaluate the latter integral.

The latter integral can be readily performed if the product of the second pair of spherical harmonics can be written as a sum of spherical harmonics. Such problems were solved in the context of the theory of angular momentum, a key component of present understanding of atomic spectroscopy. The result for this integral is that it is equal to ((2l ₁+1)(2l ₂+1)/4π(2l+1))^(½) C(l ₁ l ₂ l;m ₁ m ₂ m)C(l ₁ l ₂ l;000) where the C are Clebsch-Gordon (C-G) coefficients; simple, real numbers, readily calculable, and equal to zero unless m=m₁+m₂,|m₁|≦l₁,|m₂|≦l₂,|m|≦l, and l equals one of (l₁+l₂),(l₁+l₂−1), . . . ,|l₁−l₂|. This is known from previous work in angular momentum theory.

The next step is to consider, based on the values of k_(l) ₁ and k_(l) ₂ ^((p)), which values of the C-G coefficients will be necessary in evaluating the sums over (l₁,m₁,l₂,m₂). Again, from work in angular momentum theory, the following is an expression for the C-G coefficients:

${C\left( {{l_{1}l_{2}l};{m_{1}m_{2}m}} \right)} = {\delta_{m,{m_{1} + m_{2}}} \cdot \left( {\left( {{2l} + 1} \right)\frac{{\left( {l + l_{1} - l_{2}} \right)!}{\left( {l - l_{1} + l_{2}} \right)!}{\left( {l_{1} + l_{2} - l} \right)!}{\left( {l + m} \right)!}{\left( {l - m} \right)!}}{{\left( {l_{1} + l_{2} + l + 1} \right)!}{\left( {l_{1} - m_{1}} \right)!}{\left( {l_{1} + m_{1}} \right)!}{\left( {l_{2} - m_{2}} \right)!}{\left( {l_{2} + m_{2}} \right)!}}} \right)^{1/2} \cdot {\sum\limits_{q}{\left( {- 1} \right)^{q + l_{2} + m_{2}}\frac{{\left( {l_{2} + l + m_{1} - q} \right)!}{\left( {l_{1} - m_{1} + q} \right)!}}{{q!}{\left( {l - l_{1} + l_{2} - q} \right)!}{\left( {l + m - q} \right)!}{\left( {q + l_{1} - l_{2} - m} \right)!}}}}}$ where q are the integers satisfying 0

(l ₂ −l ₁ +m)≦q≦(l+m)

(l ₂ −l ₁ +l) where A

B=max(A,B) and C

D=min(C,D). As usual, the convention is that 0!=1. Other useful facts about C-G coefficients are that C(l₁0l;m₁0m)=δ_(l) ₁ _(l)δ_(m) ₁ _(m), and that C(l₁l₂l;000)=0 unless (l₁+l₂+l) is even.

Consider the 1=0, m=0 factor (of L₀₀) in r_(p)({circumflex over (z)},{circumflex over (n)}), the d.c. response, so to speak. Since l=0, l₁=l₂, and as m=0, m₂=−m₁={overscore (m)}₁ one can either evaluate C(l ₁ l ₂0;m ₁ m ₂0)=δ_(l) ₁ _(l) ₂ δ_(m) ₁ _({overscore (m)}) ₂ (−1)^(l) ¹ ^(−m) ₁ /(2l ₁+1)^(½) or perform the above integration over incident-light directions ∫d{circumflex over (n)} _(i) Y ₀₀({circumflex over (n)} _(i))Y* _(l) ₁ _(m) _(1*) ({circumflex over (n)} _(i))Y _(l) ₂ _(m) ₂ ({circumflex over (n)} _(i))=(−1)^(m) ¹ (4π)^(−½)δ_(l) ₁ _(l) ₂ δ_(m) ₁ _({overscore (m)}) ₂ to obtain for l=0,m=0:

${r_{P}\left( {\hat{z},\hat{n}} \right)}_{00} = {{u\left( {\hat{z} \cdot \hat{n}} \right)}{L_{00}\left( {4\pi} \right)}^{1/2}{\sum\limits_{l_{1} = 0}{k_{l_{1}}{k_{l_{1}}^{(p)}\left( {{2l_{1}} + 1} \right)}^{- 1}{\underset{m_{1} = {- l_{1}}}{\sum\limits^{l_{1}}}{{Y_{l_{1}m_{1}}\left( \hat{n} \right)}{Y_{l_{1}m_{1}}^{*}\left( {\beta,\alpha} \right)}}}}}}$ Thus, unlike the cases of Lambertian and specular reflection, even for uniform lighting conditions, the reflected light is a function of the surface normal {circumflex over (n)}. This is in addition to the effect of the nonlinearity due to the u({circumflex over (z)}·{circumflex over (n)}) term, which enters in all cases. Do note, however, that k_(l) ₁ decreases as 1/l₁ ², while k_(l) ₁ ^((p)) changes comparatively slowly until

${l_{1} > p^{\frac{1}{2}}},$ where upon it falls roughly as exp−l₁ ²/p reaching 2^(−p) by l₁=p, and then falls off as 1/l₁ ^(1+p) for l₁>p. Thus, the major {circumflex over (n)}-dependent term in the (1=0, m=0)-component of r_(P) will be the l₁=1 term in the above sum. (The sum over m₁ is, of course, real.) For the case of r′_(p),k_(l) ₁ ⁽⁰⁾ falls off as 1/l₁.

One might wonder how, if the lumination is uniform in solid angle (L_(lm)=0, l>0), one could possibly have an angular dependence in the reflected light. However, any scattering which depends on, and only on, the angle to the surface normal will exhibit a non-uniform distribution when viewed from different directions. For example, if light were scattered only in the direction of the surface normal, regardless of incident angle, then a sphere would appear dark except for an intense bright emission from the center of the apparent disk. This is easily seen analytically. Choosing {circumflex over (n)}_(e)={circumflex over (z)}, the expressions for r_(S),r_(L),r_(P) have the generic from

${r\left( \hat{n} \right)} = {{\int{{\mathbb{d}{\hat{n}}_{i}}{L\left( \hat{n} \right)}{g\left( {{\hat{n}}_{i},\hat{n}} \right)}}} = {{\sum\limits_{l\; m}{L_{l\; m}{G_{l\; m}\left( \hat{n} \right)}g\left( {{\hat{n}}_{i},\hat{n}} \right)}} = {\sum\limits_{l\; m}{{G_{l\; m}\left( \hat{n} \right)}{Y_{l\; m}^{*}\left( {\hat{n}}_{i} \right)}}}}}$

Thus G₀₀=∫d{circumflex over (n)}_(i)g({circumflex over (n)}_(i),{circumflex over (n)})/(4π)^(½) and since g_(P) depends on ({circumflex over (n)}_(i)·{circumflex over (n)}), ({circumflex over (n)}_(e)·{circumflex over (n)}_(i)), and ({circumflex over (n)}_(e)·{circumflex over (n)}), the integral over {circumflex over (n)}_(l), (4π)^(½)G₀₀, will be a function of ({circumflex over (n)}·{circumflex over (n)}_(e))=({circumflex over (n)}·{circumflex over (z)}). Hence, although G₀₀ is independent of the azimuthal orientation of {circumflex over (n)}, in most cases it will depend on {circumflex over (n)}'s polar orientation. Since g_(L) depends only on ({circumflex over (n)}_(i)·{circumflex over (n)}), G₀₀ for the Lambertian model is independent of {circumflex over (n)}; the tight coupling between the incident and reflected light in specular reflection yields G₀₀=Y₀₀({circumflex over (n)}′_(e))=(4π)^(−½) a constant. However, for most g({circumflex over (n)}_(i),{circumflex over (n)}), G₀₀=G₀₀({circumflex over (n)}·{circumflex over (z)}). Without the ({circumflex over (n)}_(i)·{circumflex over (n)}) constraint, this directional dependence can be suppressed; however, scattering from very oblique surfaces can be enhanced as much as a factor of 2.

See also the coefficients of L_(1m) and L_(2m) in the expression for r_(p)({circumflex over (z)},{circumflex over (n)}). For l=1, the only possible values of (l₁, l₂) are l₁=(l₂+1) and l₂=(l₁+1), (l₁

l₂) starting from zero: l₁=l₂ is out as l₁+l₂+l=2l₁+l, which, being odd, renders C(l₁l₁l; 000)=0; and l₁=l₂+n or l₂=l₁+n yields a smallest l of n, hence n≧2 are out for l=1. Also, as C(l ₁ l ₂ l;m ₁ m ₂ m)=(−1)^(l) ¹ ^(+l) ² ^(−l) C(l ₁ l ₂ l;m ₁ m ₂ m) (exchange), one can easily obtain the l₁=(l₂+1) case. Using Rose's expression, find:

$C\left( {{{l_{1}\left( {l_{1} + l} \right)}1};{{{m_{1}\left( {m - m_{1}} \right)}m} = {\left( {- 1} \right)^{l_{1} - m_{1}}\left( \frac{6{\left( {2l_{1}} \right)!}{\left( {1 + m + l_{1} - m_{1}} \right)!}{\left( {1 - m + l_{1} + m_{1}} \right)!}}{{\left( {{2l_{1}} + 3} \right)!}{\left( {l_{1} - m_{1}} \right)!}{\left( {l_{1} + m_{1}} \right)!}{\left( {1 + m} \right)!}{\left( {1 - m} \right)!}} \right)^{1/2}}}} \right.$ in which, −l₁≦m₁≦l₁ and −1≦m≦1. Using this, one can obtain for the (l=1, m) term in the Phong-model reflectance:

${r_{P}\left( {\hat{z},\hat{n}} \right)}_{1m} = {{u\left( {\hat{z} \cdot \hat{n}} \right)}{L_{1m}\left( {- 1} \right)}^{m}\left( {4{\pi/3}} \right)^{1/2}{\sum\limits_{l1}{\frac{6\left( {l_{1} + 1} \right)}{\left( {{2l_{1}} + 3} \right)\left( {{2l_{1}} + 2} \right)\left( {{2l_{1}} + 1} \right)} \cdot {\sum\limits_{m_{1}}{\left( {{k_{l_{1}}k_{l_{1} + 1}^{(p)}{Y_{l_{1}m_{1}}\left( \hat{n} \right)}{Y_{l_{1} + {1m_{1}} - m}^{*}\left( {\beta,\alpha} \right)}} + {k_{l_{1} + 1}k_{l_{1}}^{(p)}{Y_{l_{1} + {1m_{1}} - m}^{*}\left( \hat{n} \right)}{Y_{l_{1}m_{1}}\left( {\beta,\alpha} \right)}}} \right) \cdot \left( \frac{{\left( {1 + m + l_{1} - m_{1}} \right)!}{\left( {1 - m + l_{1} + m_{1}} \right)!}}{{\left( {l_{1} - m_{1}} \right)!}{\left( {l_{1} + m_{1}} \right)!}{\left( {1 + m} \right)!}{\left( {1 - m} \right)!}} \right)^{1/2}}}}}}$ In this (l=1) expression, the sum on l₁ runs from l₁=0, and the sum on m₁ runs from m₁=−l₁ to m₁=+l₁. As with l=0, the terms in the l₁ sum fall off as 1/l₁ ^(3+p) for l₁>p.

For l=2, the Clebsch-Gordon coefficients are more complicated. We will simply state the result for this component of the reflectance. Note that only l₂=l₁+2, l₁=l₂+2, and l₁=l₂ must be considered, and that the first two are, or course, closely related. The third case is more involved and requires separate expressions for m=±2, m=±1, and m=0.

${{r_{P}\left( {\hat{z},\hat{n}} \right)}_{2m} = {{u\left( {\hat{z} \cdot \hat{n}} \right)}{L_{2m}\left( {- 1} \right)}^{m}\left( {4{\pi/5}} \right)^{1/2}\left( {{\sum\limits_{l1}{\frac{60{\left( {2l_{1}} \right)!}{\left( {l_{1} + 2} \right)!}}{{\left( {{2l_{1}} + 5} \right)!}{l_{1}!}} \cdot {\sum\limits_{m_{1}}{\left( {{k_{l_{1}}k_{l_{1} + 2}^{(p)}{Y_{l_{1}m_{1}}\left( \hat{n} \right)}{Y_{l_{1} + {2m_{1}} - m}^{*}\left( {\beta,\alpha} \right)}} + {k_{l_{1} + 2}k_{l_{1}}^{(p)}{Y_{l_{1} + {2m_{1}} - m}^{*}\left( \hat{n} \right)}{Y_{l_{1}m_{1}}\left( {\beta,\alpha} \right)}}} \right) \cdot \left( \frac{{\left( {2 + m + l_{1} - m_{1}} \right)!}{\left( {2 - m + l_{1} + m_{1}} \right)!}}{{\left( {2 + m} \right)!}{\left( {2 - m} \right)!}{\left( {l_{1} + m_{1}} \right)!}{\left( {l_{1} - m_{1}} \right)!}} \right)^{1/2}}}}} + {\sum_{l_{1}}^{\prime}{k_{l_{1}}k_{l_{1}}^{(p)}\frac{30{\left( {{2l_{1}} - 2} \right)!}}{\left( {{2l_{1}} + 3} \right)!}{\left( \left( {\left( {l_{1} - 1} \right){l_{1}\left( {l_{1} + 1} \right)}\left( {l_{1} + 2} \right)} \right) \right)^{1/2} \cdot {\sum_{m_{1}}^{''}{\left( {\left( {l_{1} + m_{1}} \right)\left( {l_{1} + m_{1} - 1} \right)\left( {l_{1} - m_{1} + 2} \right)\left( {l_{1} - m_{1} + 1} \right)} \right)^{1/2} \cdot \left( {{\delta_{m,2}{Y_{l_{1}m_{1}}\left( \hat{n} \right)}{Y_{{l_{1}m_{1}} - 2}^{*}\left( {\beta,\alpha} \right)}} + {\delta_{m,{- 2}}{Y_{l_{1}m_{1}}^{*}\left( \hat{n} \right)}{Y_{{l_{1}m_{1}} - 2}\left( {\beta,\alpha} \right)}}} \right)}}}}} + {\left( {l_{1}\left( {l_{1} + 1} \right)} \right)^{1/2}{\sum_{m_{1}}^{\prime}{\left( {\left( {l_{1} + m_{1}} \right)\left( {l_{1} - m_{1} + 1} \right)} \right)^{1/2}{\left( {1 - {2m_{1}}} \right) \cdot \left( {{\delta_{m,1}{Y_{l_{1}m_{1}}\left( \hat{n} \right)}{Y_{{l_{1}m_{1}} - 1}^{*}\left( {\beta,\alpha} \right)}} + {\delta_{m,{- 1}}{Y_{l_{1}m_{1}}^{*}\left( \hat{n} \right)}{Y_{{l_{1}m_{1}} - 1}\left( {\beta,\alpha} \right)}}} \right)}}}} + {\frac{2}{3}{l_{1}\left( {l_{1} + 1} \right)}{\sum_{m_{1}}{\left( {{l_{1}\left( {l_{1} + 1} \right)} - {3m_{1}^{2}}} \right)\delta_{m,0}{Y_{l_{1}m_{1}}\left( \hat{n} \right)}{Y_{{l_{1}m_{1}} - 1}\left( {\beta,\alpha} \right)}}}}} \right)\text{1/2}}}\mspace{14mu}$

In this (l=2) expression, the terms in the first (l₂=l₁+2), second (l₂=l₁, m=2), and fourth (l₂=l₁, m=0) l₁-summations all fall off as 1/l₁ ^(3+p), as for l=0,1, while those in the third (l₂=l₁, m=1) fall off as 1/l₁ ^(5+p). The unprimed sum on l₁ is over l₁=0, and the primed sum is over l₁=1. The unprimed sums over m₁ are from −l₁≦m₁≦l₁, the single primed sum is from −l₁+1≦m₁≦l₁, and the doubled prime sum is from −l₁+2≦m₁≦l₁. The expression C(l ₁ l ₂ l;m ₁ m ₂ {overscore (m)})=(−1)^(l) ¹ ^(+l) ² ^(−l) C(l ₁ l ₂ l;{overscore (m)} ₁ {overscore (m)} ₂ m) was useful in the above derivation.

From the foregoing, the function of broadened specular reflectance, r_(P)({circumflex over (z)},{circumflex over (n)}), can be described in terms of a summation of the spherical harmonic components of the light incident upon the model.

It is useful to consider the spherical harmonic components of the reflected light as images of a given model, nine in the exemplary embodiment, each image produced by light reflected from the model when illuminated by only one spherical harmonic component of incident light. The complete set of images that model can produce can be approximated by some linear combination of those nine images. Therefore, matching a model of the object to some input image becomes a matter of optimizing the coefficients of each harmonic component image so that the summation of the components creates a rendered image that most nearly approximates the input image.

The method of deriving the reflectance function described above is used in the object recognition process. To match an input image with an object model in the database, the object model must be positioned to match the position of the input image. There are standard methods of performing this function known in the art, though a preferred method is disclosed in U.S. patent application Ser. No. 09/538,209, filed 30 Mar. 2000, entitled “Method for Matching a Two Dimensional Image to One of a Plurality of Three Dimensional Candidate Models in a Database”, now U.S. Pat. No. 6,956,569, the disclosure of which is hereby incorporated by reference.

Once positioned, images of the model are rendered under each of one or more spherical harmonic components of light as described above. These component rendered images can be linearly combined to arrive at an optimal rendered image of the model. An optimal rendered image means a rendered image that most closely approximates the input image. The comparison of the various rendered images to the input image is made by calculating the Euclidian Distance between the images, in a manner that is known in the art. The optimization is performed by a least squares fit technique between the vector that is the reflectance function and the matrix that represents the input image, also known in the art.

Other steps follow the optimization in the recognition process. One is computing some measure of similarity between the input image and the optimal rendered image for each model. This is accomplished be measuring the Euclidian Distance between each optimal image and the input image. The object model whose optimal rendered image is nearest the input image, assuming some threshold value of similarity must be reached, is a match to the input image.

While the reflectance function of the present invention is described in application to the field of machine vision, it is not limited to that field. There are numerous parallels between this work and advances in the field of computer graphics, and our invention has applicability there as well. The field of computer graphics is continually striving to produce more realistic rendered images based on object models that exist only in the computer. The reflectance function of the present invention advances that work as well.

Our invention has been described herein with reference to a particular exemplary embodiment. Certain alterations and modifications may be apparent to those skilled in the art, without departing from the scope of the invention. The exemplary embodiment is not meant to be limiting on the scope of the invention, which is defined by the appended claims. 

1. A method of recognizing objects under various lighting conditions comprising: (a) providing a database comprising a plurality of three dimensional models, (b) providing an input image, (c) orienting each three dimensional model relative to the input image, (d) determining, for each three dimensional model, a rendered image which is most similar to the input image, said determining step comprising: (i) deriving a reflectance function that describes an approximation of the set of all possible rendered images that each three dimensional model can produce under all possible lighting conditions, said rendered images including both diffusely and specularly reflected light; and (ii) optimizing the reflectance function to determine rendered image of each model that is most similar to the input image; (e) computing a measure of similarity between the input image and each optimal rendered image; and (f) selecting the three dimensional model corresponding to the optimal rendered image whose measure of similarity is most similar to the input image, wherein the reflectance function employs a broadened-specular reflectance model that accounts for the angle between the direction of observation and the direction of perfect specular reflectance, said broadened-specular reflectance model axially symmetric about the axis of perfect specular reflection.
 2. The method according to claim 1 wherein the reflectance function includes a mathematical term to account for the absence of reflected light due to the position of the object model.
 3. A method of image recognition of an input image, the method comprising: (a) receiving a three-dimensional model of a candidate object in a particular orientation; (b) generating a set of harmonic images for the three-dimensional model, the harmonic images forming the basis for a linear subspace, and approximating the specular and scattering reflectance on the candidate object when illuminated by a harmonic light; and (c) selecting a candidate image from the set of harmonic images, the candidate image representing a point in the linear subspace that is nearest to the input image, by seeking a vector of harmonic coefficients that minimizes the difference between the input image and the candidate image.
 4. The method according to claim 3, wherein the candidate image is restricted to a subset of harmonic images that corresponds to physically realizable lighting conditions.
 5. The method according to claim 3, wherein the linear subspace comprises the first four harmonic images.
 6. The method according to claim 3, wherein the linear subspace comprises the first nine harmonic images.
 7. The method according to claim 3, wherein the linear subspace comprises the first eighteen harmonic images.
 8. The method according to claim 3, wherein approximating the specular and scattering reflectance on the candidate object comprises a term dependent upon the cosine of the angle between the reflected light and the direction of perfect specular reflection.
 9. The method according to claim 3, wherein approximating the specular and scattering reflectance on the candidate object comprises a term derived from the Phong model of reflectance. 